Biclustering microarray data by Gibbs sampling

Authors

  • Qizheng Sheng
  • Yves Moreau
  • Bart De Moor
Abstract

MOTIVATION: Gibbs sampling has become a method of choice for the discovery of noisy patterns, known as motifs, in DNA and protein sequences. Because handling noise in microarray data presents similar challenges, we have adapted this strategy to the biclustering of discretized microarray data.

RESULTS: In contrast with standard clustering that reveals genes that behave similarly over all the conditions, biclustering groups genes over only a subset of conditions for which those genes have a sharp probability distribution. We have opted for a simple probabilistic model of the biclusters because it has the key advantage of providing a transparent probabilistic interpretation of the biclusters in the form of an easily interpretable fingerprint. Furthermore, Gibbs sampling does not suffer from the problem of local minima that often characterizes Expectation-Maximization. We demonstrate the effectiveness of our approach on two synthetic data sets as well as a data set from leukemia patients.
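To make the idea of a "sharp probability distribution" over a subset of conditions concrete, here is a minimal Python sketch on discretized data. It is not the authors' Gibbs sampler and does not reproduce their model. It only shows, for a fixed group of genes, how one might pick out the conditions where those genes concentrate on one expression level, and how a smoothed per-condition frequency table (a fingerprint-like summary) could be computed. The function names, the entropy threshold, the pseudocount, and the toy matrix are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(data, genes, condition, n_levels):
    """Shannon entropy (bits) of the discretized levels that the
    selected genes take under a single condition."""
    counts = Counter(data[g][condition] for g in genes)
    total = len(genes)
    entropy = 0.0
    for level in range(n_levels):
        p = counts.get(level, 0) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def sharp_conditions(data, genes, n_levels, max_entropy=0.5):
    """Conditions under which the selected genes have a low-entropy
    (sharp) level distribution. The threshold is a hypothetical choice."""
    n_conditions = len(data[0])
    return [c for c in range(n_conditions)
            if column_entropy(data, genes, c, n_levels) <= max_entropy]


def fingerprint(data, genes, conditions, n_levels, pseudocount=1.0):
    """Pseudocount-smoothed level frequencies per condition."""
    table = {}
    denom = len(genes) + pseudocount * n_levels
    for c in conditions:
        counts = Counter(data[g][c] for g in genes)
        table[c] = [(counts.get(level, 0) + pseudocount) / denom
                    for level in range(n_levels)]
    return table


# Toy discretized matrix (rows: genes, columns: conditions, levels 0..2).
data = [
    [2, 0, 1, 2],
    [2, 0, 0, 1],
    [2, 0, 2, 0],
    [0, 1, 1, 2],
]
genes = [0, 1, 2]  # hypothetical candidate gene group

conditions = sharp_conditions(data, genes, n_levels=3)
table = fingerprint(data, genes, conditions, n_levels=3)
print(conditions)
print(table)
```

In this toy example the three selected genes agree on a single level only under the first two conditions, so only those conditions are retained. A sampler in the spirit of the abstract would additionally draw the gene and condition memberships at random from their conditional distributions instead of fixing the gene group in advance.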

Similar resources

Functional grouping of yeast genes via biclustering microarray data.

Biclustering algorithm on Gibbs sampling strategy is a recruit in the field of the analysis of gene expression data of microarray experiments. Its feasibility and validity still need to be researched not only for synthetic datasets but also for real datasets. Here we investigated a biclustering algorithm on a microarray dataset of Yeast genome through building a database for storing microarray ...

GEMS: a web server for biclustering analysis of expression data

The advent of microarray technology has revolutionized the search for genes that are differentially expressed across a range of cell types or experimental conditions. Traditional clustering methods, such as hierarchical clustering, are often difficult to deploy effectively since genes rarely exhibit similar expression pattern across a wide range of conditions. Biclustering of gene expression da...

Discovering Relevance-Dependent Bicluster Structure from Relational Data

In this paper, we propose a statistical model for relevance-dependent biclustering to analyze relational data. The proposed model factorizes relational data into bicluster structure with two features: (1) each object in a cluster has a relevance value, which indicates how strongly the object relates to the cluster and (2) all clusters are related to at least one dense block. These features simp...

Nonparametric Bayesian Biclustering

We present a probabilistic block-constant biclustering model that simultaneously clusters rows and columns of a data matrix. All entries with the same row cluster and column cluster form a bicluster. Each cluster is part of a mixture having a nonparametric Bayesian prior. The number of biclusters is therefore treated as a nuisance parameter and is implicitly integrated over during simulation. M...

Latent Dirichlet Bayesian Co-Clustering

Co-clustering has emerged as an important technique for mining contingency data matrices. However, almost all existing coclustering algorithms are hard partitioning, assigning each row and column of the data matrix to one cluster. Recently a Bayesian co-clustering approach has been proposed which allows a probability distribution membership in row and column clusters. The approach uses variatio...

Journal:
  • Bioinformatics

Volume 19, Suppl 2

Publication date: 2003